GENERATING MARKOV EVOLUTIONARY MATRICES FOR A 

GIVEN BRANCH LENGTH 

MARTA CASANELLAS AND ANNA KEDZIERSKA 



Abstract. Under a markovian evolutionary process, the expected number of 
substitutions per site (also called branch length) that have occurred when a 
sequence has evolved from another according to a transition matrix P can be 
approximated by — j log det P. When the Markov process is assumed to be con- 
tinuous in time, i.e. P = exp Qt it is easy to simulate this evolutionary process 
for a given branch length (this amounts to requiring Q of a certain trace) . For 
the more general case (what we call discrete-time models), it is not trivial to 
generate a substitution matrix P of given determinant (i.e. corresponding to 
a process of given branch length). In this paper we solve this problem for the 
most well-known discrete-time models JC69*, K81*, K80*, SSM and GMM. These 
models lie in the class of nonhomogeneous evolutionary models. For any of 
these models we provide concise algorithms to generate matrices P of given 
determinant. Moreover, in the first four models, our results prove that any of 
these matrices can be generated in this way. Our techniques are mainly based 
on algebraic tools. 



1. Introduction 

Phylogenetic reconstruction methods are usually tested on simulated data, 
i.e. DNA (or protein) sequences that have been randomly generated following 
a molecular evolutionary model on a phylogenetic tree. It is easy to generate a 
random DNA sequence that evolves from a given DNA sequence under a given 
evolutionary model if no more constrains are required: one just needs to give 
random values to the parameters of the model and generate data according to 
the conditional probabilities obtained from the parameters. An extra effort is 
needed if the amount of "substitution events" is fixed; this magnitude is usually 
called the branch length of the edge relating both sequences in the phylogenetic 
tree. 
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We will assume (as it is commonly done) that sites in a DNA sequence 
are independent and identically distributed (iid hypothesis), so that one just 
models the evolution of one site (thought as a random variable taking values in 
{A, C, G, T}). The most common molecular evolutionary models used in phyloge- 
netics are the so-called continuous-time models. In these models, the substitution 
events along an edge e of a rooted phylogenetic tree occur following a continuous- 
time Markov process: there is an instantaneous mutation rate matrix Q (usually 
fixed throughout the tree) that operates at intensity A e and for duration t e so 
that the substitution matrix (or transition matrix) P e equals exp(Q- A e t e ). Among 
them there are the time- reversible models Jukes-Cantor JC69 [JC69], Kimura two- 
parameters K80 [Kim80], Kimura three-parameters K81 [Kim81], HKY [HKY85], 
and GTR [Tav86]. 

In this paper we consider a broader class of evolutionary models, the ( discrete- 
time) Markov models on phylogenetic trees. Briefly, the parameters of these 
models consist of a rooted tree topology, a root distribution, and substitution 
matrices P e on the edges e of the tree whose entries correspond to the conditional 
probabilities P(x\y, e) that a nucleotide y at the parent node of e is substituted 
by nucleotide x at the child node. In particular, there is no instantaneous rate 
matrix fixed for the whole tree in these models, so that they account for what is 
called nonhomogeneous data: different lineages in the tree are allowed to evolve 
at different rates. We refer to [GPS03], [AR04], and [SS03, chapter 8] for a 
mathematical approach to the evolutionary models used in this paper. 

If a DNA sequence has evolved from another according to a substitution 
matrix P e , then the number of substitutions per site that have occurred can be 
approximated by 

(1) J(e) = ~logdet(P e ) 

(see [BH87]). This is usually known as the branch length of edge e measured in the 
expected number of substitutions per site. In the case of stationary continuous- 
time models, it coincides with —jtr(D(irjQ\ e t e ) if P e = exp(Q ■ \ e t e ) and -D(IT) 
is a diagonal matrix with entries corresponding to the stationary distribution II. 

Generating DNA sequences evolving under a stationary continuous-time 
evolutionary model on an edge e with preassigned branch length / and given 
rate matrix Q, is not difficult: according to equation (1) one just needs to take 
A e t e = — 1/ tr(D(U)Q) and follow the usual process to generate a Poisson dis- 
tribution according to these parameters. There are several programs available 
for generating data under most-used continuous-time evolutionary models, for 
example seq-gen [RG97] and evolver in PAML [Yan97]. 
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Here we deal with the problem of generating data evolving under the more 
general discrete-time models when the branch lengths of the tree are fixed. From 
what we have seen above, this problem is equivalent to generate substitution 
matrices P e (belonging to the evolutionary model) with given determinant. As 
the substitution matrices are stochastic matrices, this is not an easy task. We 
solve this problem for the so-called equivariant models JC69*, K81*, K80* and 
SSM ([DK09],[CFS11]), and for the general Markov model GMM ([BH87], [Ste94], 
[AR03]). Models JC69*, K81*, K80* correspond to the discrete-time version of the 
corresponding continuous-time models, and SSM contains HKY as a submodel. Our 
results for the first four models (Propositions 3.1, 4.2, 5.1, and 6.7) are actually 
bidirectional: we provide algorithms for generating any strictly stochastic matrix 
M with determinant equal to a given number K e (0,1), when M is either a 
JC69*, K81*, K80* or SSM matrix. For the most general model GMM we provide a 
way of generating strictly stochastic matrices with determinant equal to K, but 
we are not able to claim whether we produce all of them. We observe that we 
are able to produce matrices that are not a exponential of a real rate matrix (cf. 
Remark 5.5). 

The algorithms proposed in this paper have been implemented in C++ in 
order to generate multiple sequence alignments of DNA data evolving on any 
phylogenetic tree. This work will be presented in a forthcoming paper. Note that 
in [JHA + 03] the authors introduce an algorithm to generate data on quartet trees 
under nonhomogeneous continuous-time models. 



Definition 2.1. A 4 x 4 matrix A with real entries and row sums equal to 1, 



is called a GMM matrix. The GMM matrix above is called a SSM matrix if a 3) i = a 2 ,4, 

^3,2 = ^2,3? a 3,3 — a 2,2, ^3,4 = 0>2,1, ^4,1 = a l,4, ^4,2 = ^1,3, a 4,3 = a l,2, «4,4 = a l,l- 

If moreover a^i = a 2 , 2 , ai, 2 = a>2,i, &i,3 — a 2,4 and 0^4 = a 2) 3, then A is called a 
K81* matrix. If a K81* matrix satisfies ai ;2 = 01,4, then it is called a K80* matrix 
and it is called a JC69* matrix if also a 1:2 = a lj3 . 



2. Preliminaries 
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In other words, a SSM matrix is a matrix of type 
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a K81* matrix is a matrix of type 
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a K80* matrix is a matrix of type 
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with a + b + c + d = 1; 



with a + 2b + c = 1; 



and a JC69* matrix is a matrix of type 
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with a + 3b = I. 



The names of the matrices above come from well known evolutionary mod- 
els: in the stochastic case, GMM is a transition matrix for the general Markov 
model ([BH87], [Ste94], [AR03]), SSM for the strand symmetric model introduced 
in [CS05], K81* for the discrete-time version of Kimura three-parameters model 
[Kim81], K80* for the discrete-time version of Kimura two-parameters model 
[Kim80], and JC69* for the discrete-time version of Jukes-Cantor model [JC69]. 

Definition 2.2. A square matrix A is called a stochastic matrix if it has row 
sums equal to 1 and nonnegative real entries. It is called strictly stochastic if 
moreover all its entries are strictly positive. 



We recall that the determinant of any stochastic matrix has absolute value 
less than or equal to 1 (this is a consequence of Perron- Frobenius theorem). In 
this paper we address the problem of providing stochastic matrices of the above 
shapes with given determinant K e (0, 1). 

Before ending the preliminaries section we want to point out in the lemma 
below that JC69*, K80* and K81* matrices are diagonalizable. 
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Lemma 2.3. Let A 

consider the matrix 



( a b c d \ 
bade 
c d a b 
\d c b a / 



be a K81* matrix (a + b + c + d= 1) and 



/ll 1 1 \ 
1-1-1 1 
1-11-1 

V 1 1 - 1 - 1 / 

Then S^ 1 = and S^AS is a diagonal matrix with diagonal entries {l,a — b — 
c + d,a — b + c — d,a + b — c — d} (in this order). 

Remark 2.4. The change of variables considered in the Proposition above cor- 
responds to the discrete Fourier transform in the setting of [SS05]. 



3. Generating JC69* matrices with given determinant 



Proposition 3.1. Let K e (0, 1) and let 
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a + 3b = 1, 



be a JC69* matrix. Then A is a strictly stochastic matrix with determinant equal 
to K if and only if a = \{l + 3K 1 / 3 ), b = 

Proof. Using Lemma 2.3 we have det A = (^^-) 3 . Therefore, A has determinant 
equal to K if and only if a = \(1 + 3/C 1 / 3 ). Moreover, as K G (0, 1), we obtain 
1 > a > (and so < b = < 1), and we are done. □ 

Therefore we have: 

Algorithm 3.2. (Generation of JC69* matrices with given determinant.) 
Input: K in (0, 1). 

Output: A strictly stochastic JC69* matrix A with determinant K. 
Step 1 : Set a = \(1 + 3A 1 / 3 ), b = 
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Final : Return 



A = 



( a b b b \ 

b a b b 

b b a b 

\b b b a J 



4. Generating K80* matrices with given determinant 

Remark 4.1. As a technical step previous to the generation of K80* matrices 
with given determinant, we consider the polynomial 

p K (x) = -2x 3 + x 2 + K, Ke(0,l), 

and we observe that it has exactly one real root s which lies in (\/K, 1). Indeed, 
the coefficients of Pk{%) have one variation in sign and those of pk{—x) have no 
variation in sign. Therefore, applying Descartes' rule we obtain that Pk{%) has 
exactly one positive root s and no negative roots. Moreover, as if is a constant 
in (0, 1), we have that p K (\J~K) = 2K(l — \/K) is positive and Pk(1) = K — 1 is 
negative, implying that s lies in (V~K, 1). 

Using the formula for the roots of a cubic polynomial we obtain 

s = - + - yi + 54K + 6V3K + 81K 2 + + 5AK - 6V3K + 
6 6 v 6 v 

As a byproduct, the polynomial Pk{— x ) has exactly one real root which 
coincides with — s. 

Proposition 4.2. Let K e (0,1) and let s be the unique real root of Pk{x) = 
—2x 3 + x 2 + K (see Remark 4-1)- Let 

/ a b c b \ 
^ _ babe 

c b a b ' 
y b c b a j 

be a K80* matrix (a + 2b + c = 1), and consider the change of variables a = 
1 — 2(6 + c), j3 — 1 — 46. Then A is a strictly stochastic matrix with determinant 
equal to K if and only if \/K < \a\ < s and (5 = K/a 2 . 

Proof. First we note that the inverse change of variables is b = c = 1+/3 4 ~ 2ct . 
Moreover, a = a — c and j3 = a — 2b + c are the diagonal entries in S _1 AS 
(different than 1) in Lemma 2.3 and therefore det(A) = a 2 /3. 
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=>•) Assume that A is strictly stochastic with determinant K. Then b is 
strictly positive, so that j3 < 1. As K — det(A) = a 2 (3 and j3 < 1, we obtain 
| ct | > v^A 7 . In particular, a^O and we can write f3 = K/a 2 . 

Using the inverse change of variables above and f3 = K/a 2 we have 

3 -K/a 2 -2a , . 

a>0<=^25 + c<l<^> — < 1 <^> p K (-a) > 0. 

As noted in Remark 4.1, Pk{~ has exactly one negative root which equals —s 
and lies in (—1, — y/~K). As Pk{— x) has positive leading term, Pk(— a) > only 
holds if a > —s. 

Similarly, c is strictly positive if and only if pk(cx) > 0. Following an anal- 
ogous argument, we obtain that Pk(&) > if and only if a < s. Putting all 
together we obtain y/K < \a\ < s, as desired. 

<=) Assume that y/K < \a\ < s and (5 = K/a 2 . In particular, we have 
< /3 < j£ — 1 and we obtaing that b = is strictly positive. 

Now, as in the proof of =>•) we have that c > if and only if Pk{ol) > 0. 
And also as above, this happens if and only if a < s. As we assumed \a\ < s, we 
obtain c > 0. 

Lastly, a > if and only of Pk{— oc) > 0, and this holds if and only if a > —s 
(see proof of =>•). As we assumed |a| < s, we get that A is a strictly stochastic 
matrix. 

Moreover, det(A) = a 2 (5 = K as wanted. □ 



Using the previous result, we provide the following algorithm for generating 
strictly stochastic K80* matrices with given determinant K. It is worth pointing 
out that with this algorithm we are generating all K80* strictly stochastic matrices 
with determinant K. 

Algorithm 4.3. (Generation of K80* matrices with given determinant.) 
Input: K in (0, 1). 

Output: A strictly stochastic K80* matrix A with determinant K. 

Step 1 : Compute the unique real root s of Pk{ x ) using Remark 4.1. 

Step 2: Choose a randomly such that y/K < \a\ < s. 

Step 3: Let f3 := K/a 2 , b:=^,c := ±±^, and a := 1 - 2b - c. 
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/ a b c b \ 
babe 
c b a b 
y b c b a J 

5. Generating K81* matrices with given determinant 

Previously to dealing with the case of K81* matrices, for each real number 
K in (0, 1), we let s be the unique positive root of the polynomial 

q K (z) :=z(z + l) 2 -4K. 

Indeed, according to Descartes' rules of signs, this polynomial has at most one 
positive root. Moreover, as qx{K) < and qx{^) > 0, there is exactly one 
positive root s and it lies in (K, 1). Using the formula for the roots of a cubic 
polynomial we obtain 

(2) s = -\-\\J-l -54^ + 6^3^ + 81^-^-1 -5AK -6V3K + 81K 2 . 

Proposition 5.1. Let K G (0, 1) and let s be the unique real root of qx(z) : = 
z{z + l) 2 - AK. Let 

/ a b c d \ 
^_ bade 

c d a b ' 
y d c b a J 

be a K81* matrix (a + 2b + c = 1), and consider the change of variables a = 
1 — 2(6 + c), (3 — 1 — 2(6 + d), 7 = 1 — 2(c + d). Then A is a strictly stochastic 
matrix with determinant equal to K if and only if \a\ G (s, 1), \(3\ G (l\ a \, J\ a \) 
where 




Remark 5.2. As the change of variables above is symmetric in b, c, d, the roles 
of these three variables can be exchanged in the previous Proposition. 

Before proving this Proposition we need the following technical lemma. 



Final : Return 

A := 
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Lemma 5.3. Let K be a real number in (0, 1), let s be the unique positive solution 
to z(z + l) 2 — 4K = 0, and consider the function 

K 

f(x,y) = l-x-y + — 

xy 

defined over M. 2 \ {0}. Given y > 0, we consider the set 

tt y = {xeR\x> 0J(x,y) > 0J(x,-y) > 0J(-x,y) > 0J(-x,-y) > 0}. 

Then fl y is not empty if and only if y > s. Moreover, if x G Vt y and y < 1, then 
x belongs to (I y , J y ) where 



:i-y) 2 + f i + y- J(i-i/) 2 -f 

I y = max <( v — , \ and 



J y = min 



i + y + x/(i-y) 2 -f i-y + J(i- y y + f 



Proof. We fix y > 0, and we view / and g as functions on x. For x > we can 
multiply f,gbjx and define quadratic functions f y (x) := —x 2 + (1 — y)x + K/y 
and g y {x) := x 2 + (1 + y)x + K/y so that x belongs to Vt y if and only if x > 0, 
f y (x) > 0, f- y (x) > 0, g y (x) > and g- y (x) > 0. 

Note that f y has discriminant Ai(y) = (1— y) 2 + — and g y has discriminant 
A 2 (y) = (l+y) 2 -f . 

We observe that Ai(y) > for y > 0. Therefore f y (x) = has two real 
solutions xi )L (y) = - v 7 '^ 1 ^ )^ Xi^{y) = y+ V Al( -vl^ anc [ ^(x) is positive for x 
in (xi^, x^r). Note that y/ Ai(y) > \ 1 — y| for y > 0, so xi yL (y) is negative and 
x i,r(u) is positive. Therefore, for x > and y > 0, /^(a;) is positive if and only 
if x e (0,x liR (y)). 

On the other hand, as f- y has negative leading coefficient, there exists x 
with f- y (x) > if and only if Ai(— y) > 0. Note that Ai(— y) is positive for 
y > if and only if y > s (indeed, Ai(— y) coincides with qK(y)/y)- 

Thus f- y (x) > has a solution for x > 0, if and only if y > s. Now for 
x > 0,y > s, the roots of f- y (x) = are y) and £i,.r(— y). Clearly Xi^—y) 

and Xi ) l(— y) are both positive for y > s. Therefore, for x > and y > 0, we have 
/-j, (a) > if and only if y > s and x G (xi, L (-y), x ljR (-y)). 
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Now we study the positivity of g y (x) for x > 0. Note that g y has discrimi- 
nant Ai(— y). As the leading coefficient of g y is positive, we have that g y (x) > 
for all y < s and x G K (because in this case the discriminant is negative). 

Moreover, if y > s, the real roots of g y (x) = are X2,L(y) = — — ~ an d 

X2,R.(y) = ^ 1+ ^ + ^ Al ^ y \ They are both negative so that g y (—x) is positive for 
all y > s and x > 0. 

We study the positivity of g_ y (x) for x > and y > 0. The discriminant 
of g_ y is Ai(y), and it is positive for y > 0. Then the roots of g_ y are x 2i l(— y) 
and X2,R{—y). For y > we have X2,L(—y) < and X2 t R(—y) > 0, and therefore 
g- y (x) > if and only if x belongs to (x 2t R(—y) 1 +oo). 

Summing up, we have proven that the set Vt y is non-empty if and only if 
y > s. Moreover, in that case, if x belongs to fl y , then x lies in 

(0, x ljR (y)) n (x liL (-y),x ljR (-y)) n (0, +oo) n (x 2 , R (-y), +oo) . 

It is easy to see that xi iR (y) is bigger than X2, R (—y) for y > 0. Therefore 
the intersection of intervals above is equal to 

(xi, L (-y),x ltR (-y)) n (x 2 ,R(-y),zi,ii(y)) • 

The statement of the lemma follows from the following claim. 
Claim: If y < 1, then x 2 , R {-y) < x liR (-y). 
Proof of Claim: This is equivalent to proving 
(3) - v/ATFyT < 2. 

First of all we note that Ai(y) < Ai(-y) if and only if y > ™. As y > 0, 

this holds if and only if y > ^2K. Therefore, for y > V2K, y/A^y)- y/A^-y) 
is negative (and hence < 2.) 

If y < y/2K, we have just seen that \J~A~Jjj) > •^/A 1 (— y). In this case, both 
sides in (3) are positive and hence it is equivalent when raising it to the second 
power: 

A 1 (y) + Ax(-y) - 2 y /A 1 (y)A 1 (-y) < 4. 

As we are assuming y < 1, we have Ai(y) + Ai(— y) — 4 = 2y 2 — 2 < < 
2-y/Ai(y)Ai(— y), as we wanted to prove. □ 

Proof of Proposition 5.1. Taking into account that a — 1 — b — c — d, we note 
that inverse change of variables is a = |(1 + a + (5 + 7), b = \{1 — a — (5 + 7), 
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c— |(1 — a + (3 — 7), d — |(1 + ct — /5 — 7). Observing that a, /3, 7 are the diagonal 
entries in S^AS in Lemma 2.3, we see that det A = a(3j. 

=>•) Assume that A is stochastic with determinant X G (0,1). Then a, 
(3, and 7 are non-zero, and 7 = From the positivity of a, 6, c, d we get that 

1 + a + P + ^X), l- a -P + ^> 0,l-a + (3- g > 0, and 1 + a -/5 - § > 0. 
In terms of Lemma 5.3, these inequalities can be rewritten as 

_ a ) > 0, a) > 0, /(/?, -a) > 0, /(-/?, a) > 0. 

Therefore is an element of Q\ a \, which implies that \a\ > s (see Lemma 5.3). 
Moreover, as a = 1 — 2(5 + d), and b, d > 0, we see that |a| < 1. The result then 
follows from Lemma 5.3. 

<=) Using Lemma 5.3 we see that under these assumptions, Q\ a \ 7^ and \ f3\ 
belongs to Q\ a \. Therefore f(—/3, -a) > 0, f(j3, a) > 0, f(/3, -a) > 0, a) > 

0. As 7 = these inequalities coincide with a > 0, b > 0, c> and d > 0, and 
we are done. □ 

The previous results give us a way of generating any K81* matrix. 

Algorithm 5.4. (Generation of K81* matrices with given determinant.) 

Input: K in (0, 1). 

Output: A strictly stochastic K81* matrix A with determinant K. 

Step 1 : Compute the unique real root s of z(z + l) 2 — 4A^using (2). 
Step 2: Choose a randomly such that 1 > \a\ > s. 
Step 3: Take /3 randomly such that \/3\ belongs to (I\ a \, J\ a \). 
Step 4: Set 7 = J. 

Step 5: Set a = f (1 + a + /3 + 7), 6 = i(l-a-/3 + 7 ), c= + -7), 

d=i(l + a-/3- 7 ). 
Final : Return 

/ a 6 c <i \ 
^ _ bade 
c d a b 
\ d c b a J 

Remark 5.5. The change of variables in Proposition 5.1 diagonalizes the matrix 
to Diag(l, a, (3, 7) (see Lemma 2.3). As we have seen in that proposition, a and 
(3 can be both negative. Therefore, using [Cul66], we observe that the matrices 
produced by the algorithm above are not all of them of type exp(Q) for a real 
matrix Q. 
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6. Generating SSM matrices with given determinant 



Definition 6.1. Let A be a 4 x 4 real matrix. We call F(A) the matrix obtained 
from A after performing the basis change F(A) = S^AS where 

/ 1 -1 \ 
Oil 
1-10 
10 1 



S 



\ 



/ 



When A is a SSM matrix, A can be viewed as an element in Hom G (C 4 , C 4 ) 
where G =< (AT)(CG) > (see [CFS07]). The change of basis above decomposes C 4 
into its isotypic components via the natural linear representation G — > GL(C 4 ). 
This change of basis is also known as the generalized Fourier transform (see 
[CS05]). We have the following fact: 



Lemma 6.2. A 4 x 4 matrix A - 
the following shape: 

( 



F(A) 



V 



A 






is a SSM matrix if and only if F(A) has 
\ 



1 - A 
fx 








a 
P' 






a' 



In this case, A, fx, a, a', (3,f3 f can be written in terms of the entries of A as 
A = ai,i + ai,4, A* = °2,2 + 02,3, ot = 02,2 — 02,3, Oi' = 02,4 — 0,2,1, P = — 01,4, 
and /?' = a 1 3 — a x 2 . Tne inverse change of variables is an = (A + pi)/ 2, a 12 = 
(1 - A - /3'j/2, ax',3 = (1 - A + /3')/2 a M = (A - /3)/2, a 2 ,i = (1 - ^ - «') A 
a 2 ,2 = (A* + a)A a 2 , 3 = {n~ a)/2, a 2)4 = (!-// + a')/2. 



Proof. The matrix -F(A) for a generic matrix A 



is 





/ 01,1 + 01,4 + a 4 ,i + a 4 ,4 

CS2,1 + <22,4 + 03,1 + Cl3,4 


ai,2 + ai,3 + 04,2 + 04,3 

02,2 + a2,3 + «3,2 + 03,3 


ai,2 

«2,2 


— 01,3 + 124,2 — a 4 ,3 

— 0-2,3 + (23,2 — Cl3,3 


ai,4 
a 2 ,4 


— Ol,l — (24,1 + <24,4 

— 02,1 — 03,1 + 123,4 


2 I 


CS2,1 + <22,4 

\ 0,4,1 + 14,4 


— 03,1 — 03,4 

— ai^i — 01,4 


(2 2 ,2 + (22,3 — 0-3,2 — <23,3 
(24,2 + Cl4,3 — a l,2 — Ol,3 


a2,2 
ai,3 


— (22,3 — 123,2 + 013,3 

— Ol,2 + d4,2 — (24,3 


0t2,4 
Ol,l 


— 122,1 + (23,1 — 03,4 

— (2l,4 — (24.1 + <24 ; 4 



If A is a SSM matrix, then 03,1 = 02,4, 03,2 = 02,3, 03,3 = 02,2, «3,4 = 0,2,1, 
04,1 = a± t 4, a^2 = «i,3, «4,3 = 01,2, and 04,4 = ai ;i . Therefore the non-diagonal 
blocks are 0. Moreover, as sums of rows are equal to 1, we have that the entries 
of each row in the upper left block sum to 1: 

T^ 1 ' 1 + °1,4 + a 4,l + ^4,4 + ^1,2 + a l,3 + °4,2 + a 4,3) — 1) 
1 , 

-(a 2 ,i + a 2 ,4 + 03,1 + a 3 ,4 + a 2 , 2 + a 2 ,3 + a 3i2 + a 3)3 ) = I. 
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Conversely, imposing that the entries of non-diagonal blocks in F(A) are 
equal to is equivalent to imposing g^i = 02,4, 0^2 = 02,3, 03^ = 02,2? a 3,4 — a>2,i, 
04,1 = ^1,4, ^4,2 = di,3, ^4,3 = Qi,2) an d °4,4 = a i,i (adding and subtracting certain 
pairs of equations). Moreover, F(A) ltl + F(A) 1:2 = 1 implies that sum of rows 1 
and 4 is equal to 2 (and similar for rows 2 and 3). But we have just seen that 
the set of entries in the first (resp. second) row is equal to the set of entries in 
the forth (resp. third) row, thus the sum of entries in each row is equal to 1. □ 

In the following lemma we characterize the stochasticity of A via F{A). 
Lemma 6.3. A is a strictly stochastic SSM matrix if and only if 

( A 1-A \ 

1 - n /I 

a a' 

\ /3' J 

with A, jj, G (0, 1), \(5\ < X, \(5'\ < 1 - A, |a| < fx, and \a'\ <!-//. 



F(A) 



Proof. If A is a SSM matrix, then 



A 



V 

with a + b + c + d = 1, e + / + g + h 
above with X = a + d, fi = g + f, (3 = 



b 
f 
9 
c 



c 
9 
f 
b 



d \ 

h 



= 1, and by Lemma 6.2, F(A) has the shape 
a — d, j3' = c — b, a = f — g, and a' = h — e. 



If a,b, . . . ,h are strictly positive, then we clearly have A,/i G (0, 1), |a| < 
fi,\a'\ < 1 - //, \/3\ < A, and \(5'\ < 1 - A. 

Conversely, if F(A) is block-diagonal as in the statement of the lemma, 
we know by Lemma 6.2 that A is a SSM matrix with entries as above. As the 
inverse change of variables is a = (A + /3)/2, b = (1 - A - p')/2, c = (1 - A + fi')/2 
d = (A-/3)/2, e = (l-/i-a')/2, / = (n+a)/2, g = {fi-a)/2, h = {l-fi + a')/2, 
then if A, \i lie (0, 1), \a\ < /i, \a'\ < 1 — /i, \j3\ < A, and \(3'\ < 1 — A, we obtain 
that a,b, . . . ,h are strictly positive. □ 

Before stating the main result of this section we introduce some notation 
and we prove a technical result. 

Remark 6.4. Given K G (0, 1), we consider the polynomial tk(z) = z 3 + z — 2K. 
It has a unique positive real root. Indeed, by Descartes' rule of signs we see that 
r K has at most one positive real root. Moreover, as r K (K) is strictly negative 
and r K (l) is strictly positive, there exists exactly one positive root v> of r K {z) 
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and it lies in (K, 1). Using the formula for the roots of a cubic polynomial we 
actually get 

v = -\\J-27K + 3V81AT 2 + 3~ - ^-27^-3^81^2 + 3. 

Definition 6.5. Given K G (0, 1), we consider the polynomial tk{z) = z 3 +z—2K 
and we call u its unique positive root (Remark 6.4). We define 9 as the set of 
points (A,/i) G (0, l) 2 satisfying 



z/ + l<A + /i<2, and |A — fx\ < min < 2 — A — 



A + /i - 1 



Lemma 6.6. Le£ A,/i fre rea/ numbers in (0, 1) wzft A+/i > 1. T/ien (A,/i) belongs 
to i/ and only if 

(4) ^ -(l-A)(l-/i)<A/i. 

A -+- /i — 1 

Proof. As A + /i > 1, we exchange the inequality (4) by the following equivalent 
inequality: 

(5) (A + /x-l)(2A/x + l-A-/x)-X>0. 



We consider the change of variables s := A + t := A — /i (so that A 



s+t 
2 ' 

\i = ^). We observe that A and n lie in (0, 1) if and only if \t\ < s and \t\ < 2 — s. 
As we are assuming A + /i > 1, we have s > 2 — s. Therefore, A,/i are real numbers 
in (0, 1) with A + /i > 1 if and only if |t| < 2 — s. 

In these new variables inequality (5) reads as (s — l)( s2 ~ f2 + 1 — s) — K > 0, 
which is equivalent to 

rfi x -2 . ( g -l)(( g -l) 2 + l)-27f _ r x ( g -l) 

<^=) Let A,;U be real numbers in (0, 1) satisfying A + fi > 1 and (5). Then 
s := A + ji lies in (1,2), \t :— A — //| < 2 — s, and s,t satisfy (6). In particular, 
rjf ^~ 1 - > > 0. As we have s > 1, this inequality is positive if and only if its 
numerator is positive, which holds if and only if s — 1 > vq. Therefore s is in 



u + 1, 2) and \t\ < min I 2 — s, \ I Tk{ ~ s ^ \^ in other words, (A, /i) belongs to 



=4>) Conversely, let (A,/i) G 0. Then, using the change of variables above, 
we have that (s,t) satisfies \t\ < \J TK s' s _^ ■ In particular, (6) is satisfied and 
hence (4) is satisfied as well. □ 
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Proposition 6.7. Given K a real number in (0, 1), we consider the polynomial 
rj^{z) = z 3 + z — IK and let u be its positive real root in (K, 1) (see Remark 
6.4)- We fix two real numbers X, /i in (0, 1) such that A + fi > 1. Then the set 

^ = | (a, (3) e R 2 < a < n, < A, \a(3 - A+ ^_ 1 I < (1 - - //) } 

non-empty if and only if (A,/x) belongs to G. Moreover in this case, (a,/3) 
belongs to fi A(U if and only if a belongs to ^ A+M ~ 1 ^ — - — — , a > 0, and 

/ , ^zt-(1-A)(1-m) \ . f ^ + (l-A)(l-^) 

max < —A, — > < p < mm < A, — 

[ alia 

Proof. =>) If (a,/3) is a point in fi Aj/i , then |a/3 - A+ ^_ 1 1 < (1 - A)(l - //). This 
is equivalent to 

(7) 3^ - (1 - A)(l - „) < cB < ^JL_ + (1 - A)(l - „). 

In particular, as a/3 < A/x, we have 

K 



(l-A)(l-/i)<A/i. 



A + /i- 1 

Hence, using Lemma 6.6 we obtain (A,//) G G. 

Moreover, as \(3\ < A, inequality A+ ^_ 1 — (1 — A)(l — /i) < a (3 implies 
— (1 — A) (1 — < Aa, and therefore a belongs to the interval 

A 

The inequalities on /3 follow directly from (7) and from \f}\ < A. Conversely, if a 
belongs to the above interval, and (3 satisfies 

max < —A, — - > < p < mm < A, — - 

j a j j a 

then inequalities (7) hold and hence (a, (3) lies in Q\^. 

<=) Let (A,/i) be a point in G. In this case (A,//) satisfies (4), and in par- 
ticular, the interval 



16 MARTA CASANELLAS AND ANNA KEDZIERSKA 

is non-empty. We choose a > in this interval. 
Then, the interval 

|^ A^T-(l-A)(l-^) ; ^T + (l-A)(l-/i) \ 
la a I 

is non-empty (the left-hand side numerator is smaller than the right-hand side 
numerator, and the denominator is positive) and its intersection with (—A, A) is 
not empty. Indeed, as a > and a belongs to the interval (8), we have 



a 



< A; 



... _ . , K _, +(1-A)(1-m) . . . . 

moreover —A is less than because this expression is positive. 

Finally, we choose (3 in this intersection of intervals and we obtain a point 
(a,/3) inQ AjM . " □ 

Theorem 6.8. Let K be a real number in (0, 1). 



(a) Let (A,/i) be a point in 0, let (a, (3) be a point in and consider real 
numbers a' and (3' such that 

i-f^ < I/ 3 '! < 1 - A > and 



. , la/3- , 
(l) 



(ii) a' = 

Then, if we consider the change of variables a = (A + (3)/2,b = (1 — A 
(3')/2,c = (l-A + /3')/2 d = (X-f3)/2, e = (l-p-a!)/2, f = {jx + a)/2, 
g = ([/, — a)/2, h = (1 — fj, + a')/2, the matrix 



A 



( a b c d \ 

e f g h 

h g f e 

y d c b a J 



is a strictly stochastic SSM matrix with determinant K, a + d + f + g > 1, 
b 7^ c, and f < g. 
(b) Conversely, let 

( a b c d \ 

e f g h 

h g f e 

y d c b a J 



GENERATING MARKOV EVOLUTIONARY MATRICES FOR A GIVEN BRANCH LENGTH7 



be a strictly stochastic SSM matrix with determinant K and with a + d + 
g + f > 1, b ^ c and f > g. Then F(A) is equal to 

( A 1-A \ 
1 — fi fi 
a a' 

\ o o p> p J 

where (A,//) G 0, (cx,P) G ^a,^; and a', (3' satisfy conditions (i) and (ii) 
stated in (a). 



Remark 6.9. (1) By Proposition 6.7, if (A, fi) is a point in 6, there exists (a, P) G 
£l\ tfi . This implies that \afi — A+ R '_ 1 1 is smaller than (1 — A)(l — fi), and thus the 
interval 

|Q/? " A 



is non-empty. In particular, there exists P' in this interval. Therefore conditions 
(i) and (ii) in Theorem 6.8(a) are not empty. 

(2) Assumptions a + d + g + f > 1, f > g, b ^ c are biologically meaningful: 
the elements in the diagonal of an evolutionary Markov matrix stand for the 
conditional probabilities of no mutation, which are supposed to be much higher 
than the off-diagonal probabilities. It is even reasonable to assume that these 
diagonal entries are greater than 0.5, giving in particular a + d + g + f>l. In 
any case, the result proved above can be easily adapted to the case a+d+g+f < 1 
or / > g (we have not done it here in order to make the paper more readable). 
Note also that any SSM matrix with determinant K and f > g gives rise to a SSM 
matrix with f < g and determinant K by permuting its 1st and 4th rows and its 
2nd and 3rd rows (or columns, if preferred). 

The hypothesis b ^ c was added to simplify the statement of the Theorem 
and can be easily removed. Indeed, a matrix A as in (b) has b = c and determinant 
equal to K if and only if F(A) has P' — and K is equal to (A + fi — l)ap. 
Therefore A is strictly stochastic with determinant K and b = c if and only 



if 



K 



A(A+/i-l) 

a'\ < 1 — ji 



< \a\ < /i, 



K 



a(A+/i-l) ' 



P' = and a' is any number satisfying 



Proof, (a) Let A be defined from A, /i, P, . . . , a as above. Then F(A) is equal to 



B 



1-A 



/ A 

1 — 11 jJ, 



\ 



\ 



a a' 

P' P J 
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We prove that A is a stochastic matrix using Lemma 6.3. 

By hypothesis, (A,/i) G 6 and hence A and // lie in (0,1). Moreover, as 
G we have < a < < A. By assumption (i), < 1 — A is also 

satisfied. It remains to prove that \a'\ < 1 — fi. But this follows from conditions 
(i) and (ii): 

M = ^ <l-n. 

Row sums in A are equal to 1 by definition of a,b, . . . ,h. Moreover, as 
B = F(A) is obtained from A by a basis change, we have that detA = det B 
and it coincides with (A + fi — — Thus, by assumption (ii) we have 

detA = K. 

(b) Lemma 6.2 tells us that F(A) has the shape in the statement of the 
Proposition, and that X — a + d, fj, — g + f,a — f — g, a' = h — e, (3 = a — d, 
and /3' — c — b. By Lemma 6.3 we have that A,/i lie in (0, 1), [ct\ < A, \/3\ < A, 
| a' | < 1 — /i, \/3'\ < 1 — A. Moreover, as we are assuming a + d + g + / > 1, fe^c, 
and / > g, we have A + // > 1, f3' ^ 0, and < a < //. 

On the other hand, detA = implies = (A + /i — l)(a/3 — a' ft') and 
therefore condition (ii) holds. 

The remaining inequality in (i), 

| a/3 x 1 
1 — fj, 

holds because \a'\ satisfies (ii) and \a'\ < 1 — //. 

We prove now that (cc, /3) belongs to fi AM , that is, 



\+n-i i 



< \n 



(9) l^- T-^ J <(l-A)(l-/i). 

A -+- /I — 1 

We have just seen that \f3'\ satisfies condition (i), so 

M-jf^f-fKIW-*.) 

and this last term is < (1 — A)(l — //). Therefore (9) is satisfied. 

Finally, as (a, (3) is a point in A /X , this set is not empty and (A, //) belongs 
to by Proposition 6.7. □ 
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The previous results and their proofs provide the following algorithm for 
generating any SSM matrix 

/ a b c d \ 
e f g h 
h g f e 
y d c b a J 

with a + d + g + f>l, f > g, and b ^ c. 

Algorithm 6.10. (Generation of SSM matrices with given determinant.) 
Input: K in (0, 1). 

Output: A strictly stochastic SSM matrix A with determinant K. 



Step 1 
Step 2 

Step 3 

Step 4 

Step 5 

Step 6 

max 



Compute the unique positive root u Q of r K {z) following Remark 6.4. 
Take s randomly in [i/ + 1,2). 

Take t randomly such that \t\ < min <^2 — s, ^/ r ^~ 1 ) j . 



Set A = s±L anc l /j = 
Take a > randomly in 
Choose j3 randomly such that 

! (l-A)(l-/i) 



A it 1 



-A, 



< j3 < min < A, 



Step 7: Choose randomly such that 



|a/: 



— I 

■u-1 



a/3- . 

Step 8: Set a' := 



i-fi 



< \(3'\ < 1-A. 



, a := (A + /3)/2,6 := (1 - A - /3')/2,c := (1 - A + /3')/2 
d := (A - P)/2, e:=(l-fi- a')/2, f:=(fi + a)/2, g:={n~ a)/2, and 
h := (l-v + a')/2. 
Final : Return 

/ a b c d \ 
e f g h 
h g f e 
y d c b a J 

Remark 6.11. As SSM matrices include K81* matrices, using Remark 5.5 we see 
that there exist matrices produced by the algorithm above that are not of type 
exp(Q). 
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7. Generating GMM matrices with given determinant 

For GMM matrices we do not have such a general result as in the previous 
sections. We do not know how to generate any strictly stochastic GMM matrix, 
but here we explain a way for generating some of them. 

We could obtain a strictly stochastic matrix GMM matrix with determinant 
equal to K by exponentiating a rate matrix (i.e. a matrix with row sums equal to 
and off-diagonal positive entries) with trace equal to log if (cf. [PS05, Theorem 
4.19]). However, not all GMM matrices are of this type (see [Cul66] and Remark 
5.5). We use that the product of two strictly stochastic matrices is again a strictly 
stochastic matrix in order to obtain a broader class of GMM matrices. In fact, we 
multiply a GMM matrix of type exp(Q) with determinant 5 > K by a SSM matrix 
of determinant Kj S. We must admit that we do not know how much larger is this 
class of matrices. The set V of GMM matrices with determinant K corresponds 
to an affine variety of dimension 11. There are 11 free parameters for a rate 
matrix Q with given trace, so the matrices of type exp(Q) lie on a subset of V 
of dimension 11. Therefore the set of matrices produced by the algorithm below 
form a subset of maximum dimension of V, and this subset is larger than the set 
{exp(Q)\Q rate matrix, trQ — K}. 

Algorithm 7.1. (Generation of GMM matrices with given determinant.) 
Input: K in (0, 1). 

Output: A strictly stochastic GMM matrix A with determinant K. 
Step 1 : Take a random number t in (log if, 0). 

Step 2: Generate a random rate matrix Q with nonzero entries and trQ — t. 
Step 3: Compute A = exp(Q). 

Step 4: Following algorithm 6.10, generate a strictly stochastic SSM matrix B 
with determinant equal to K/e*. 
Final: Return A = BA . 
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